A Distributed Feature Selection Approach Based on a Complexity Measure

Authors

  • Verónica Bolón-Canedo
  • Noelia Sánchez-Maroño
  • Amparo Alonso-Betanzos
Abstract

Feature selection is often required as a preliminary step for many machine learning problems. However, most of the existing methods only work in a centralized fashion, i.e. using the whole dataset at once. In this paper we propose a new methodology for distributing the feature selection process by samples which maintains the class distribution. Subsequently, it performs a merging procedure which updates the final feature subset according to the theoretical complexity of these features, by using data complexity measures. In this way, we provide a framework for distributed feature selection independent of the classifier and that can be used with any feature selection algorithm. The effectiveness of our proposal is tested on six representative datasets. The experimental results show that the execution time is considerably shortened whereas the performance is maintained compared to a previous distributed approach and the standard algorithms applied to the non-partitioned datasets.
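The abstract gives no implementation details, so the following is only a minimal sketch of the three ideas it names: partitioning by samples while keeping the class distribution, scoring features with a data complexity measure, and merging the per-partition results. All function names are hypothetical. The Fisher discriminant ratio is used here as one well-known complexity measure, and the vote-based merge is a simplifying assumption; neither should be read as the authors' exact procedure.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, pvariance


def stratified_partitions(labels, n_parts, seed=0):
    """Split sample indices into n_parts disjoint subsets, each keeping
    roughly the class proportions of the full dataset."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    parts = [[] for _ in range(n_parts)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the partitions.
        for pos, idx in enumerate(indices):
            parts[pos % n_parts].append(idx)
    return parts


def fisher_ratio(values, labels):
    """Fisher discriminant ratio of one feature in a two-class problem:
    (mean_a - mean_b)^2 / (var_a + var_b). Higher means the feature
    separates the classes more easily (lower complexity)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [v for v, y in zip(values, labels) if y == classes[0]]
    b = [v for v, y in zip(values, labels) if y == classes[1]]
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / denom


def merge_by_votes(selections, min_votes):
    """Assumed merge rule: keep features selected in at least
    min_votes partitions."""
    votes = Counter(f for subset in selections for f in set(subset))
    return sorted(f for f, n in votes.items() if n >= min_votes)
```

In use, any feature selection algorithm would be run on each partition returned by `stratified_partitions`, and the resulting subsets passed to a merge step such as `merge_by_votes`; a complexity score like `fisher_ratio` could then guide which merged features to keep.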

Similar resources

A New Framework for Distributed Multivariate Feature Selection

Feature selection is considered an important issue in the classification domain. Selecting good features by maximum relevance to the class label and minimum redundancy among features improves classification accuracy. However, most current feature selection algorithms work only in a centralized fashion. In this paper, we suggest a distributed version of the mRMR featu...

A multi agent method for cell formation with uncertain situation, based on information theory

This paper treats the cell formation problem as a distributed decision network. It proposes an approach based on the application and extension of information theory concepts to analyze the informational complexity that arises in an agent-based system from interdependence between agents. Based on this approach, new quantitative concepts and definitions are proposed in order to measure the amount of t...

Improvement of effort estimation accuracy in software projects using a feature selection approach

In recent years, the use of feature selection techniques has become an essential requirement for processing and model construction in different scientific areas. In the field of software project effort estimation, the need to apply dimensionality reduction and feature selection methods has become unavoidable. The high volumes of data, costs, and time necessary for gathering data, ...

Measurement of Complexity and Comprehension of a Program Through a Cognitive Approach

The inherent complexity of software systems creates problems in the software engineering industry. Numerous techniques have been designed to comprehend the fundamental characteristics of software systems. To understand software, it is necessary to know the complexity level of the source code. Cognitive informatics plays an important role in better understanding the complexity o...

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in different areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...

Publication date: 2015